12. Order of Operations in Text Processing
Question:

Solution:
INSTRUCTOR NOTE:
Here's an example that might help if this is all a little abstract:
Suppose that the text in question is "responsibility is responsive to responsible people" (ok, this doesn't make sense as a sentence, but you know what I mean…)
If you put this into a bag of words straightaway, you get something like
[is: 1
people: 1
responsibility: 1
responsive: 1
responsible: 1
to: 1]
and then applying stemming gives you
[is: 1
people: 1
respon: 1
respon: 1
respon: 1
to: 1]
(if you can even find a way to stem the CountVectorizer object in sklearn; the most likely outcome of trying is that your code would crash…)
You would then need another post-processing step to merge the duplicate stems and get to the following bag of words, which is what you'd get straightaway if you stemmed first:
[is: 1
people: 1
respon: 3
to: 1]
The last bag, with the three counts merged under a single stem, is the one you want, so stemming before building the bag of words gets you the right answer directly.
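
If it helps to see the two orderings side by side, here is a minimal sketch in plain Python. It does not use sklearn or a real stemmer: `toy_stem` and its suffix list are hypothetical, hard-coded only so that the three example words collapse to "respon" as in the note above (a real stemmer such as Porter's may produce a different stem), and `collections.Counter` stands in for the bag of words.

```python
from collections import Counter

# Hypothetical toy stemmer: the suffix list is an assumption, chosen only so
# that the three example words reduce to "respon" as in the note above.
SUFFIXES = ("sibility", "sible", "sive")


def toy_stem(word):
    """Strip the first matching suffix; leave other words unchanged."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


text = "responsibility is responsive to responsible people"
tokens = text.split()

# Order A: stem first, then count. Counts land under one stem immediately.
stem_first = Counter(toy_stem(token) for token in tokens)

# Order B: count first, then stem the vocabulary entries.
bag = Counter(tokens)
stemmed_entries = [(toy_stem(word), count) for word, count in bag.items()]
# stemmed_entries now holds three separate ("respon", 1) pairs, so an
# extra merge step is needed to combine them.
merged = Counter()
for stem, count in stemmed_entries:
    merged[stem] += count

print(stem_first)
print(stemmed_entries)
print(merged)
```

Both orderings end at the same counts, but only after the extra merge loop in the second one, which is the post-processing step the note describes.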